CONTENTS

Chapter 1. Introducing Python

1.1 "And Now for Something Completely Different"

This book is about using Python, a very high-level, object-oriented, open source[1] programming language, designed to optimize development speed. Although it is completely general-purpose, Python is often called an object-oriented scripting language, partly because of its sheer ease of use, and partly because it is commonly used to orchestrate or "glue" other software components in an application.

If you are new to Python, chances are you've heard about the language somewhere, but are not quite sure what it is about. To help you get started, this chapter provides a nontechnical introduction to Python's features and roles. Most of it will make more sense once you have seen real Python programs, but let's first take a quick pass over the forest before wandering among the trees.

In the preface, I mentioned that Python emphasizes concepts such as quality, productivity, portability, and integration. Since these four terms summarize most of the reasons for using Python, I'd like to define them in a bit more detail:

Quality

Python makes it easy to write software that can be reused and maintained. It was deliberately designed to raise development quality expectations in the scripting world. Python's clear syntax and coherent design almost forces programmers to write readable code -- a critical feature for software that may be changed by others. The Python language really does look like it was designed, not accumulated. Python is also well tooled for modern software reuse methodologies. In fact, writing high-quality Python components that may be applied in multiple contexts is almost automatic.

Productivity

Python is optimized for speed of development. It's easy to write programs fast in Python, because the interpreter handles details you must code explicitly in lower-level languages. Things like type declarations, memory management, and build procedures are nowhere to be found in Python scripts. But fast initial development is only one component of productivity. In the real world, programmers must write code both for a computer to execute and for other programmers to read and maintain. Because Python's syntax resembles executable pseudocode, it yields programs that are easy to understand long after they have been written. In addition, Python supports (but does not impose) advanced paradigms such as object-oriented programming, which further boost developer productivity and shrink development schedules.

Portability

Most Python programs run without change on almost every computer system in use today. In fact, Python programs run today on everything from IBM mainframes and Cray Supercomputers to notebook PCs and handheld PDAs. Although some platforms offer nonportable extensions, the core Python language and libraries are platform-neutral. For instance, most Python scripts developed on Linux will generally run on Windows immediately, and vice versa -- simply copy the script over. Moreover, a graphical user interface (GUI) program written with Python's standard Tkinter library will run on the X Windows system, Microsoft Windows, and the Macintosh, with native look-and-feel on each, and without modifying the program's source code at all.

Integration

Python is designed to be integrated with other tools. Programs written in Python can be easily mixed with and script (i.e., direct) other components of a system. Today, for example, Python scripts can call out to existing C and C++ libraries, talk to Java classes, integrate with COM and CORBA components, and more. In addition, programs written in other languages can just as easily run Python scripts by calling C and Java API functions, accessing Python-coded COM servers, and so on. Python is not a closed box.

In an era of increasingly short development schedules, faster machines, and heterogeneous applications, these strengths have proven to be powerful allies in both small and large development projects. Naturally, there are other aspects of Python that attract developers, such as its simple learning curve for developers and users alike, libraries of precoded tools to minimize up-front development, and completely free nature that cuts product development and deployment costs.

But Python's productivity focus is perhaps its most attractive and defining quality. As I write this, the main problem facing the software development world is not just writing programs quickly, but finding developers with time to write programs at all. Developers' time has become paramount -- much more critical than execution speed. There are simply more projects than programmers to staff them.

As a language optimized for developer productivity, Python seems to be the right answer to the questions being asked by the development world. Not only can Python developers implement systems quickly, but the resulting systems will be maintainable, portable, and easily integrated with other application components.

1.2 The Life of Python

Python was invented around 1990 by Guido van Rossum, when he was at CWI in Amsterdam. Despite the reptiles, it is named after the BBC comedy series Monty Python's Flying Circus, of which Guido is a fan (see the following silly sidebar). Guido was also involved with the Amoeba distributed operating system and the ABC language. In fact, the original motivation for Python was to create an advanced scripting language for the Amoeba system.

But Python's design turned out to be general enough to address a wide variety of domains. It's now used by hundreds of thousands of engineers around the world, in increasingly diverse roles. Companies use Python today in commercial products, for tasks such as testing chips and boards, developing GUIs, searching the Web, animating movies, scripting games, serving up maps and email on the Internet, customizing C++ class libraries, and much more.[2] In fact, because Python is a completely general-purpose language, its target domains are only limited by the scope of computers in general.

Since it first appeared on the public domain scene in 1991, Python has continued to attract a loyal following, and spawned a dedicated Internet newsgroup, comp.lang.python, in 1994. And as the first edition of this book was being written in 1995, Python's home page debuted on the WWW at http://www.python.org -- still the official place to find all things Python.

What's in a Name?

Python gets its name from the 1970s British TV comedy series, Monty Python's Flying Circus. According to Python folklore, Guido van Rossum, Python's creator, was watching reruns of the show at about the same time he needed a name for a new language he was developing. And, as they say in show business, "the rest is history."

Because of this heritage, references to the comedy group's work often show up in examples and discussion. For instance, the name "Spam" has a special connotation to Python users, and confrontations are sometimes referred to as "The Spanish Inquisition." As a rule, if a Python user starts using phrases that have no relation to reality, they're probably borrowed from the Monty Python series or movies. Some of these phrases might even pop up in this book. You don't have to run out and rent The Meaning of Life or The Holy Grail to do useful work in Python, of course, but it can't hurt.

While "Python" turned out to be a distinctive name, it's also had some interesting side effects. For instance, when the Python newsgroup, comp.lang.python, came online in 1994, its first few weeks of activity were almost entirely taken up by people wanting to discuss topics from the TV show. More recently, a special Python supplement in the Linux Journal magazine featured photos of Guido garbed in an obligatory "nice red uniform."

There's still an occasional post from fans of the show on Python's news list. For instance, one poster innocently offered to swap Monty Python scripts with other fans. Had he known the nature of the forum, he might have at least mentioned whether they ran under DOS or Unix.

To help manage Python's growth, organizations aimed at supporting Python developers have taken shape over the years: among them, Python Software Activity (PSA) was formed to help facilitate Python conferences and web sites, and the Python Consortium was formed by organizations interested in helping to foster Python's growth. Although the future of the PSA is unclear as I write these words, it has helped to support Python through the early years.

Today, Guido and a handful of other key Python developers, are employed by a company named Digital Creations to do Python development on a full-time basis. Digital Creations, based in Virginia, is also home to the Python-based Zope web application toolkit (see http://www.zope.org). However, the Python language is owned and managed by an independent body, and remains a true open source, community-driven system.

Other companies have Python efforts underway as well. For instance, ActiveState and PythonWare develop Python tools, O'Reilly (the publisher of this book) and a company named Foretech both organize annual Python conferences, and O'Reilly manages a supplemental Python web site (see the O'Reilly Network's Python DevCenter at http://www.oreillynet.com/python). The O'Reilly Python Conference is held as part of the annual Open Source Software Convention. Although the world of professional organizations and companies changes more frequently than do published books, it seems certain that the Python language will continue to meet the needs of its user community.

1.3 The Compulsory Features List

One way to describe a language is by listing its features. Of course, this will be more meaningful after you've seen Python in action; the best I can do now is speak in the abstract. And it's really how Python's features work together, that make it what it is. But looking at some of Python's attributes may help define it; Table 1-1 lists some of the common reasons cited for Python's appeal.

Table 1-1. Python Language Features

Features

Benefits

No compile or link steps

Rapid development cycle turnaround

No type declarations

Simpler, shorter, and more flexible programs

Automatic memory management

Garbage collection avoids bookkeeping code

High-level datatypes and operations

Fast development using built-in object types

Object-oriented programming

Code reuse, C++, Java, and COM integration

Embedding and extending in C

Optimization, customization, system "glue"

Classes, modules, exceptions

Modular "programming-in-the-large" support

A simple, clear syntax and design

Readability, maintainability, ease of learning

Dynamic loading of C modules

Simplified extensions, smaller binary files

Dynamic reloading of Python modules

Programs can be modified without stopping

Universal "first-class" object model

Fewer restrictions and special-case rules

Runtime program construction

Handles unforeseen needs, end-user coding

Interactive, dynamic nature

Incremental development and testing

Access to interpreter information

Metaprogramming, introspective objects

Wide interpreter portability

Cross-platform programming without ports

Compilation to portable bytecode

Execution speed, protecting source code

Standard portable GUI framework

Tkinter scripts run on X, Windows, and Macs

Standard Internet protocol support

Easy access to email, FTP, HTTP, CGI, etc.

Standard portable system calls

Platform-neutral system scripting

Built-in and third-party libraries

Vast collection of precoded software components

True open source software

May be freely embedded and shipped

To be fair, Python is really a conglomeration of features borrowed from other languages. It includes elements taken from C, C++, Modula-3, ABC, Icon, and others. For instance, Python's modules came from Modula, and its slicing operation from Icon (as far as anyone can seem to remember, at least). And because of Guido's background, Python borrows many of ABC's ideas, but adds practical features of its own, such as support for C-coded extensions.

1.4 What's Python Good For?

Because Python is used in a wide variety of ways, it's almost impossible to give an authoritative answer to this question. In general, any application that can benefit from the inclusion of a language optimized for speed of development is a good target Python application domain. Given the ever-shrinking schedules in software development, this a very broad category.

A more specific answer is less easy to formulate. For instance, some use Python as an embedded extension language, while others use it exclusively as a standalone programming tool. And to some extent, this entire book will answer this very question -- it explores some of Python's most common roles. For now, here's a summary of some of the more common ways Python is being applied today:

System utilities

Portable command-line tools, testing systems

Internet scripting

CGI web sites, Java applets, XML, ASP, email tools

Graphical user interfaces

With APIs such as Tk, MFC, Gnome, KDE

Component integration

C/C++ library front-ends, product customization

Database access

Persistent object stores, SQL database system interfaces

Distributed programming

With client/server APIs like CORBA, COM

Rapid-prototyping /development

Throwaway or deliverable prototypes

Language-based modules

Replacing special-purpose parsers with Python

And more

Image processing, numeric programming, AI, etc.

"Buses Considered Harmful"

The PSA organization described earlier was originally formed in response to an early thread on the Python newsgroup, which posed the semiserious question: "What would happen if Guido was hit by a bus?"

These days, Guido van Rossum is still the ultimate arbiter of proposed Python changes, but Python's user base helps support the language, work on extensions, fix bugs, and so on. In fact, Python development is now a completely open process -- anyone can inspect the latest source-code files or submit patches by visiting a web site (see http://www.python.org for details).

As an open source package, Python development is really in the hands of a very large cast of developers working in concert around the world. Given Python's popularity, bus attacks seem less threatening now than they once did; of course, I can't speak for Guido.

On the other hand, Python is not really tied to any particular application area at all. For example, Python's integration support makes it useful for almost any system that can benefit from a frontend, programmable interface. In abstract terms, Python provides services that span domains. It is:

Given these general properties, Python can be applied to any area we're interested in by extending it with domain libraries, embedding it in an application, or using it all by itself. For instance, Python's role as a system tools language is due as much to its built-in interfaces to operating system services as to the language itself. In fact, because Python was built with integration in mind, it has naturally given rise to a growing library of extensions and tools, available as off-the-shelf components to Python developers. Table 1-2 names just a few; you can find more about most of these components in this book or on Python's web site.

Table 1-2. A Few Popular Python Tools and Extensions

Domain

Extensions

Systems programming

Sockets, threads, signals, pipes, RPC calls, POSIX bindings

Graphical user interfaces

Tk, PMW, MFC, X11, wxPython, KDE, Gnome

Database interfaces

Oracle, Sybase, PostGres, mSQL, persistence, dbm

Microsoft Windows tools

MFC, COM, ActiveX, ASP, ODBC, .NET

Internet tools

JPython, CGI tools, HTML/XML parsers, email tools, Zope

Distributed objects

DCOM, CORBA, ILU, Fnorb

Other popular tools

SWIG, PIL, regular expressions, NumPy, cryptography

1.5 What's Python Not Good For?

To be fair again, some tasks are outside of Python's scope. Like all dynamic languages, Python (as currently implemented) isn't as fast or efficient as static, compiled languages like C. In many domains, the difference doesn't matter; for programs that spend most of their time interacting with users or transferring data over networks, Python is usually more than adequate to meet the performance needs of the entire application. But efficiency is still a priority in some domains.

Because it is interpreted today,[3] Python alone usually isn't the best tool for delivery of performance-critical components. Instead, computationally intensive operations can be implemented as compiled extensions to Python, and coded in a low-level language like C. Python can't be used as the sole implementation language for such components, but it works well as a frontend scripting interface to them.

For example, numerical programming and image processing support has been added to Python by combining optimized extensions with a Python language interface. In such a system, once the optimized extensions have been developed, most of the programming occurs at the higher-level Python scripting level. The net result is a numerical programming tool that's both efficient and easy to use.

Moreover, Python can still serve as a prototyping tool in such domains. Systems may be implemented in Python first, and later moved in whole or piecemeal to a language like C for delivery. C and Python have distinct strengths and roles; a hybrid approach, using C for compute-intensive modules, and Python for prototyping and frontend interfaces, can leverage the benefits of both.

In some sense, Python solves the efficiency/flexibility tradeoff by not solving it at all. It provides a language optimized for ease of use, along with tools needed to integrate with other languages. By combining components written in Python and compiled languages like C and C++, developers may select an appropriate mix of usability and performance for each particular application. While it's unlikely that it will ever be as fast as C, Python's speed of development is at least as important as C's speed of execution in most modern software projects.

On Truth in Advertising

In this book's conclusion we will return to some of the bigger ideas introduced in this chapter, after we've had a chance to study Python in action. I want to point out up front, though, that my background is in Computer Science, not marketing. I plan to be brutally honest in this book, both about Python's features and its downsides. Despite the fact that Python is one of the most easy-to-use programming languages ever created, there are indeed some pitfalls, which we will examine in this book.

Let's start now. Perhaps the biggest pitfall you should know about is this one: Python makes it incredibly easy to throw together a bad design quickly. It's a genuine problem. Because developing programs in Python is so simple and fast compared to traditional languages, it's easy to get wrapped up in the act of programming itself, and pay less attention to the problem you are really trying to solve.

In fact, Python can be downright seductive -- so much so that you may need to consciously resist the temptation to quickly implement a program in Python that works, and is arguably "cool," but leaves you as far from a maintainable implementation of your original conception as you were when you started. The natural delays built in to compiled language development -- fixing compiler error messages, linking libraries, and the like -- aren't there to apply the brakes in Python work.

This isn't necessarily all bad. In most cases, the early designs that you throw together fast are stepping stones to better designs that you later keep. But be warned: even with a rapid development language like Python, there is no substitute for brains -- it's always best to think before you start typing code. To date, at least, no computer programing language has managed to make intelligence obsolete.

[1]  Open source systems are sometimes called freeware, in that their source code is freely distributed and community-controlled. Don't let that concept fool you, though; with roughly half a million users in that community today, Python is very well supported.

[2]  See the preface for more examples of companies using Python in these ways, and see http://www.python.org for a more comprehensive list of commercial applications.

[3]  Python is "interpreted" in the same way that Java is: Python source code is automatically compiled (translated) to an intermediate form called "bytecode," which is then executed by the Python virtual machine (that is, the Python runtime system). This makes Python scripts more portable and faster than a pure interpreter that runs raw source code or trees. But it also makes Python slower than true compilers that translate source code to binary machine code for the local CPU. Keep in mind, though, that some of these details are specific to the standard Python implementation; the JPython (a.k.a. "Jython") port compiles Python scripts to Java bytecode, and the new C#/.NET port compiles Python scripts to binary .exe files. An optimizing Python compiler might make most of the performance cautions in this chapter invalid (we can hope).

CONTENTS